Automated Tools for Subject Matter Expert Evaluation of Automated Scoring

Authors

  • David M. Williamson
  • Isaac I. Bejar
  • Anne Sax
Abstract

As automated scoring of complex constructed-response examinations reaches operational status, the process of evaluating the quality of resultant scores, particularly in contrast to scores of expert human graders, becomes as complex as the data itself. Using a vignette from the Architectural Registration Examination (ARE), this paper explores the potential utility of classification and regression trees (CART) and Kohonen self-organizing maps (SOM) as tools to facilitate subject matter expert (SME) examination of the fine-grained (feature level) quality of automated scores for complex data, with implications for the validity of the resultant scores. The paper explores both supervised and unsupervised learning techniques, the former being represented by CART (Breiman, Friedman, Olshen, & Stone, 1984) and the latter by SOM (Kohonen, 1989). Three applications comprise this investigation, the first of which suggests that CART can facilitate efficient and economical identification of specific elements of complex solutions that contribute to automated and human score discrepancies. The second application builds on the first by exploring CART’s value for efficiently and accurately automating case selection for human intervention to ensure score validity. The final application explores the potential for SOM to reduce the need for SMEs in evaluating automated scoring. While both supervised and unsupervised methodologies examined were found to be promising tools for facilitating SME roles in maintaining and improving the quality of automated scoring, such applications remain unproven and further studies are necessary to establish the reliability of these techniques.
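As a rough illustration of the first application, the sketch below shows how a CART-style split criterion could flag the solution feature most associated with automated-versus-human score discrepancies. It is a minimal sketch, not the authors' implementation: it computes only the root split of a tree using Gini impurity, and the feature names, the binary feature coding, and the data are all hypothetical and synthetic.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, weighted_gini) for the best binary-feature split.

    This is only the first (root) split that a CART procedure would make;
    a full tree would recurse on each resulting subset.
    """
    n = len(rows)
    best = None
    for j in range(len(rows[0])):
        left = [y for x, y in zip(rows, labels) if x[j] == 0]
        right = [y for x, y in zip(rows, labels) if x[j] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (j, score)
    return best

# Hypothetical feature names; each row is one solution, coded 0/1 per feature.
FEATURES = ["feature_a", "feature_b", "feature_c"]
ROWS = [
    (1, 0, 1),
    (1, 1, 0),
    (0, 0, 1),
    (0, 1, 0),
    (1, 0, 0),
    (0, 1, 1),
]
# Synthetic labels: 1 = automated and human scores disagree, 0 = they agree.
LABELS = [0, 1, 0, 1, 0, 1]

index, impurity = best_split(ROWS, LABELS)
print(f"Most discriminating feature: {FEATURES[index]} (weighted Gini = {impurity:.3f})")
```

In this toy data the disagreement label was constructed to coincide with one feature, so the split isolates it perfectly; with real scoring data the split would only suggest where SME review might be focused.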

Similar resources

Semi-quantitative segmental perfusion scoring in myocardial perfusion SPECT: visual vs. automated analysis

Introduction: It is recommended that the physician apply at least a semi-quantitative segmental scoring system in myocardial perfusion SPECT. We aimed to assess the agreement between automated semi-quantitative analysis using QPS (Quantitative Perfusion SPECT) software and the visual approach for calculation of summed stress score (SSS), summed rest score (SRS) and summed difference score (SDS). ...

An Evaluation of IntelliMetric™ Essay Scoring System Using Responses to GMAT® AWA Prompts

The Graduate Management Admission Council® (GMAC®) has long benefited from advances in automated essay scoring. When GMAC® adopted ETS® e-rater® in 1999, the Council’s flagship product, the Graduate Management Admission Test® (GMAT®), became the first large-scale assessment to incorporate automated essay scoring. The change was controversial at the time (Iowa State Daily, 1999; Calfee, 2000). T...

Accuracy and efficiency of an automated system for calculating APACHE II scores in an intensive care unit

We evaluated the reliability and efficiency of an automated system for calculating APACHE II scores. We imported an automated APACHE II scoring system developed at another institution. We scored a convenience sample of 50 consecutive intensive care unit (ICU) admissions using three methods: (1) the automated system, (2) an expert scorer using a manual data abstraction method, and (3) the current...

Automatic sleep spindle detection: benchmarking with fine temporal resolution using open science tools

Sleep spindle properties index cognitive faculties such as memory consolidation and diseases such as major depression. For this reason, scoring sleep spindle properties in polysomnographic recordings has become an important activity in both research and clinical settings. The tediousness of this manual task has motivated efforts for its automation. Although some progress has been made, increasi...

Cost Function Modelling for Semi-automated SC, RTG and Automated and Semi-automated RMG Container Yard Operating Systems

This study analyses the concept of cost functions for semi-automated Straddle Carrier (SC), Rubber Tyred Gantry (RTG) and automated Rail Mounted Gantry (RMG) container yard operating cranes. It develops a generic cost based model for a pair-wise comparison, analysis and evaluation of economic efficiency and effectiveness of container yard equipment to be used for decision-making by terminal pla...

Publication date: 2004